Skip to content

Conversation

Shimuuar
Copy link
Contributor

Implementation follows plan described in #477 with generateA :: Applicative f => Int -> (Int -> f a) -> f (v a) as primitive. It is not subject to stream fusion.

Writing generateA is not difficult. Problem is doing it efficiently. Current benchmarks are (suggestions for more are very welcome):

  1. Generate vector of random numbers using state monad
  2. Generate vector using IO action
  3. Compute sum of vector using lens
  4. Map vector using lens

First and naive implementation uses list as intermediate data structure. Sum benchmark performs well in this case. Using STA instead brings map benchmark on par with explicit loop and produces slight (5-10%) improvements in state and IO benchmark)

newtype STA v a = STA {  _runSTA :: forall s. Mutable v s a -> ST s (v a) }

Currently sum and map perform on par with explicit loop. State gives 7x slowdown and 8x allocations, IO benchmark 4x slowdown and 8x allocations. We obviously can add rewrite rules for IO/ST but maybe there're more general optimizations.


Fixes #477, #69, #132, #144

Copy link
Contributor

@konsumlamm konsumlamm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about for/for_/traverse_ etc?

@Shimuuar
Copy link
Contributor Author

What about for/for_/traverse_ etc?

Good point! I forgot about them. But they're simpler since in this case one isn't in business of constructing vector so it simply reduces to fold.

@Shimuuar
Copy link
Contributor Author

Shimuuar commented Sep 3, 2025

I think that's about as far as reasonably possible for improving performance.

For many PrimMonads we would like to simply create buffer and just write to it. But it's not safe in general case, applicatives with backtracking can break referential transparency. And it's not clear how to express such rewrite rule and whether it's even possible. Also unstreamM suffers from same problem

Shimuuar and others added 12 commits September 5, 2025 09:30
generateA is used as primitive and all other functions are expressed in it
terms. First version goes through intermediate list.

This is simplest implementation possible and would serve as baseline for
further optimizations
We establish implementation which goes through list as baseline and the we can
try to optimize it.

Note definition of foldlOf'. It's different from definition in lens<=5.3.3
but it's absolutely necessary to get good perfomance in folds
Does wonders for traversals using Identity
No performance change in benchmarks
This is clearly not enough. There're many other types that will benefit form
same rewrite but we don't know how to do that. unstreamM suffers from same
problem.

Identity is important since it's used in lens for mapping (over) and rewrite
rule does improve performance: 10-20% in microbenchmark.
Relevant for TypeApplication and other function library use same convention
Now it works in the same way as generateM
@Shimuuar Shimuuar marked this pull request as ready for review September 5, 2025 08:33
@Shimuuar
Copy link
Contributor Author

Shimuuar commented Sep 5, 2025

I think PR is ready.

It may perform better for some applicatives
@Shimuuar Shimuuar merged commit 4b4eade into master Sep 9, 2025
55 of 56 checks passed
@Shimuuar Shimuuar deleted the traverse branch September 9, 2025 08:06
@treeowl
Copy link
Contributor

treeowl commented Sep 9, 2025

I think that's about as far as reasonably possible for improving performance.

For many PrimMonads we would like to simply create buffer and just write to it. But it's not safe in general case, applicatives with backtracking can break referential transparency. And it's not clear how to express such rewrite rule and whether it's even possible. Also unstreamM suffers from same problem

We definitely can write such rewrite rules for specific cases.

@Shimuuar
Copy link
Contributor Author

Shimuuar commented Sep 9, 2025

And for IO/ST there're such rules already. But it seems there's no way to write such rules compositionally.

If for example we want to add rewrite rules for WriterT/ReaderT/StateT over IO then there're exponentially many types for given stack depth. We probably need some variant of typeclass guided rewrites

Copy link
Contributor

@lehins lehins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not reviewing it earlier. Very busy time for me right now.
Looks great. I only had a few minor comment that don't necessarily need to be addressed

@@ -266,7 +266,11 @@ xs // ps = go xs ps' 0
go [] _ _ = []


withIndexFirst m f = m (uncurry f) . zip [0..]
-- withIndexFirst :: (Int -> a -> [a]) -> [a] -> [a]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot a comment

Suggested change
-- withIndexFirst :: (Int -> a -> [a]) -> [a] -> [a]


* [#522](https://github.com/haskell/vector/pull/522) API using Applicatives
added: `traverse` & friends.
* [#518](https://github.com/haskell/vector/pull/518) `UnboxViaStorable` added.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like an accidental indentation

Suggested change
* [#518](https://github.com/haskell/vector/pull/518) `UnboxViaStorable` added.
* [#518](https://github.com/haskell/vector/pull/518) `UnboxViaStorable` added.

-- Use fromListN to be more efficient in construction of resulting vector
-- Also behaves better with compact regions, preventing runtime exceptions
in Data.Vector.fromListN n Applicative.<$> Traversable.traverse f (toList xs)
traverse = traverse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have used the imported generic version instead for clarity. Otherwise at first glance it looks like bottom because of a recursive call. It is only when one looks at a qualified import for Traversable it becomes clear that it is not the case.

Suggested change
traverse = traverse
traverse = G.traverse

@Shimuuar
Copy link
Contributor Author

Sorry for not reviewing it earlier. Very busy time for me right now.

No problem. I you want to make a review but don't have time at the moment please leave note. I would wait

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API using Applicative (traverse et.al.)
4 participants